Stochastic modified equations for the asynchronous stochastic gradient descent
Authors
Abstract
Similar resources
Asynchronous Accelerated Stochastic Gradient Descent
Stochastic gradient descent (SGD) is a widely used optimization algorithm in machine learning. To accelerate the convergence of SGD, several advanced techniques have been developed in recent years, including variance reduction, stochastic coordinate sampling, and Nesterov's acceleration method. Furthermore, in order to improve the training speed and/or leverage larger-scale training data...
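For orientation, a minimal sketch of the SGD update combined with the Nesterov acceleration mentioned above; the least-squares problem, step size, and momentum constant are illustrative assumptions, not taken from the paper.

```python
# SGD with Nesterov momentum on a toy least-squares problem
# (problem, step size, and momentum value are assumed for illustration).
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
x_true = rng.normal(size=5)
b = A @ x_true + 0.01 * rng.normal(size=100)

x = np.zeros(5)
v = np.zeros(5)        # momentum buffer
lr, mu = 0.01, 0.9     # step size and momentum (assumed values)

for step in range(1000):
    i = rng.integers(0, 100)              # sample one data point
    lookahead = x + mu * v                # Nesterov look-ahead point
    g = (A[i] @ lookahead - b[i]) * A[i]  # stochastic gradient at look-ahead
    v = mu * v - lr * g
    x = x + v

print("parameter error:", np.linalg.norm(x - x_true))
```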
Asynchronous Decentralized Parallel Stochastic Gradient Descent
Recent work shows that decentralized parallel stochastic gradient descent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technique for improving the efficiency of distributed machine learning platforms and has been widely used in many popular machine learning software packages and solvers based on centrali...
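A minimal single-process sketch of a D-PSGD-style iteration, assuming a ring topology with uniform 1/3 mixing weights and toy quadratic local losses; neither assumption comes from the excerpt, and no communication library is used.

```python
# One D-PSGD-style step: each worker averages parameters with its ring
# neighbours (a doubly stochastic mixing step), then applies its local
# stochastic gradient. Topology, mixing weights, and losses are assumed.
import numpy as np

rng = np.random.default_rng(1)
n_workers, dim, lr = 4, 3, 0.1
targets = rng.normal(size=(n_workers, dim))  # each worker's local optimum
params = np.zeros((n_workers, dim))

def local_grad(w, k):
    # gradient of worker k's loss 0.5 * ||w - targets[k]||^2, plus noise
    return (w - targets[k]) + 0.01 * rng.normal(size=dim)

for step in range(300):
    grads = np.array([local_grad(params[k], k) for k in range(n_workers)])
    # mixing: average with left and right neighbours on the ring
    mixed = (params + np.roll(params, 1, axis=0) + np.roll(params, -1, axis=0)) / 3
    params = mixed - lr * grads

print("consensus spread:", np.linalg.norm(params - params.mean(axis=0)))
print("distance to global optimum:",
      np.linalg.norm(params.mean(axis=0) - targets.mean(axis=0)))
```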
Asynchronous Stochastic Gradient Descent with Delay Compensation
With the fast development of deep learning, people have started to train very large neural networks on massive data. Asynchronous Stochastic Gradient Descent (ASGD) is widely used for this task; however, it is known to suffer from the problem of delayed gradients. That is, by the time a local worker adds the gradient it has computed to the global model, the global model may have been updated ...
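A minimal sketch of the delayed-gradient problem and a first-order compensation of the kind the title suggests: the gradient is computed at a stale copy of the weights, and a diagonal-curvature correction term is added before the update. The g ⊙ g ⊙ (w − w_stale) form of the correction, the fixed delay, λ, and the quadratic loss are illustrative assumptions, not necessarily the paper's exact method.

```python
# ASGD with stale gradients and an assumed first-order delay compensation:
# the update uses g + lam * g * g * (w - w_stale) as a cheap correction.
import numpy as np

rng = np.random.default_rng(2)
dim, lr, lam, delay = 5, 0.05, 0.5, 4
target = rng.normal(size=dim)
w = np.zeros(dim)

def grad(w):
    # noisy gradient of 0.5 * ||w - target||^2
    return (w - target) + 0.01 * rng.normal(size=dim)

history = [w.copy()]
for step in range(300):
    w_stale = history[max(0, len(history) - 1 - delay)]  # worker's stale copy
    g = grad(w_stale)                                    # gradient at stale weights
    g_comp = g + lam * g * g * (w - w_stale)             # delay compensation term
    w = w - lr * g_comp
    history.append(w.copy())

print("distance to optimum:", np.linalg.norm(w - target))
```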
The Convergence of Stochastic Gradient Descent in Asynchronous Shared Memory
Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine learning, representing the optimization backbone for training several classic models, from regression to neural networks. Given the recent practical focus on distributed machine learning, significant work has been dedicated to the convergence properties of this algorithm under the inconsistent and noisy updates arising from...
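The lock-free shared-memory setting can be imitated in a toy way; the following is a generic Hogwild-style sketch under an assumed quadratic loss, not the paper's model. Python's GIL hides most of the real data races, so it only illustrates the access pattern of inconsistent reads and unsynchronized writes.

```python
# Toy lock-free shared-memory SGD: several threads update one shared
# parameter vector without locks, so each may read a partly stale iterate.
# Loss, thread count, and constants are assumed for illustration.
import threading
import numpy as np

dim, lr, n_threads, steps = 8, 0.01, 4, 2000
opt = np.random.default_rng(3).normal(size=dim)  # unknown optimum
w = np.zeros(dim)  # shared parameter vector, read and written without locks

def worker(seed):
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        snapshot = w.copy()                           # possibly inconsistent read
        g = (snapshot - opt) + 0.01 * rng.normal(size=dim)
        w[:] = w - lr * g                             # lock-free in-place write

threads = [threading.Thread(target=worker, args=(s,)) for s in range(n_threads)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("distance to optimum:", np.linalg.norm(w - opt))
```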
Adaptive wavefront control with asynchronous stochastic parallel gradient descent clusters.
A scalable adaptive optics (AO) control system architecture composed of asynchronous control clusters based on the stochastic parallel gradient descent (SPGD) optimization technique is discussed. It is shown that subdivision of the control channels into asynchronous SPGD clusters improves the AO system performance by better utilizing individual and/or group characteristics of adaptive system co...
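A minimal sketch of the basic SPGD update that such clusters build on: all control channels are perturbed in parallel, the resulting change in a scalar performance metric is measured, and the controls move along the perturbation scaled by that change. The quadratic metric and gain constants are illustrative assumptions; a real AO system would measure the metric from a sensor.

```python
# Basic SPGD: parallel +/- perturbation of all channels, two-sided
# measurement of a scalar metric J, ascent along delta * dJ.
import numpy as np

rng = np.random.default_rng(4)
n_channels, gain, amp = 16, 2.0, 0.1
u_opt = rng.normal(size=n_channels)  # unknown optimal control vector

def metric(u):
    # stand-in performance metric to maximize (e.g. focal-spot quality)
    return -np.sum((u - u_opt) ** 2)

u = np.zeros(n_channels)
for step in range(500):
    delta = amp * rng.choice([-1.0, 1.0], size=n_channels)  # parallel perturbation
    dJ = metric(u + delta) - metric(u - delta)              # measured metric change
    u = u + gain * dJ * delta                               # SPGD ascent step

print("final metric:", metric(u))
```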
Journal
Journal title: Information and Inference: A Journal of the IMA
Year: 2019
ISSN: 2049-8764,2049-8772
DOI: 10.1093/imaiai/iaz030